Large Scale Structure of the Universe
This notebook is interactive, and we will use it to learn and explore the distribution of galaxies in the universe. We follow a similar approach as other notebooks you might have used, in particular the one written by Gautham Narayan.
Feel comfortable to ask questions as you go along!
The webpage you are in is actually an app - much like the ones on your cellphone. This app consists of cells.
An input cell looks like a light grey box with an In [ ]: on its left. Input cells each contain code - instructions to make the computer do something.
To activate or select a cell, click anywhere inside of it.
# Text that follows a "#" is known as a comment.
# Comments do not affect your code in any way.
# You should always read the comments at the top of each cell you interact with.
# Comments will be used to describe what the cell's code is actually doing.
To execute or run a selected cell, hit [Shift + Enter] on your keyboard.
# Text that DOESN'T follow a "#" is considered code.
# Lines of code are instructions given to your computer.
# The line of code below is a "print" statement.
# A print statement literally prints out the text between its quotes.
print("Congrats! You have successfully run your first cell!")
Congrats! You have successfully run your first cell!
Running a cell creates an output directly below it. An output can be some text, a graph, an interactive slider, or even nothing at all! For that last case, you know you have run a cell when the In [ ]: becomes In [#]:, where "#" is any number.
You can learn more about how Jupyter notebooks work at https://try.jupyter.org/
In order for any of the activities to work properly, you must import the libraries needed for the code in this notebook.
"""Import libraries.
Here, you are importing the libraries needed for this notebook. These libraries set up the plotting environment in
your browser.
"""
import numpy as np
%matplotlib widget
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.path as mpath
from mpl_toolkits.mplot3d import Axes3D
from IPython.core.display import Image, Markdown, display
from ipywidgets import *
from astroML.datasets import fetch_sdss_S82standards, fetch_sdss_specgals, fetch_sdss_spectrum, fetch_vega_spectrum
from astroML.plotting import MultiAxes
print('Done! You have successfully imported the libraries.')
Done! You have successfully imported the libraries.
In this activity, you will learn how astronomers measure distances to galaxies. You will get to compare galaxies to figure out which ones are closer or further away from us. You can then use this method for many more galaxies on your own as well!
As in the first notebook, we are going to use data from the Sloan Digital Sky Survey (SDSS). This project used a telescope at Apache Point in New Mexico to look at the northern sky.
The Sloan survey team found millions of stars and galaxies, and made their big data set public. In this activity, we will retrieve and examine galaxy data!
So how did Sloan take spectra of millions of stars and galaxies? The team used metal plates like the one shown below, with a hundreds of holes aligned with the stars and galaxies to be observed. An optical fiber is placed in each hole in order to transfer the light to the instrument and camera. As you will see below, the data are identified by their Plate number, their Fiber number, and the date when they were obtained - the MJD (Modified Julian Date).
There were thousands of plates used (~2500 for SDSS), each with 640 fibers, which together gives 1.6 million spectra (including galaxies, stars, and extra spectra on blank sky).
A reference spectrum means that it is at redshift zero (not moving toward or away from us). In this case, the reference spectrum is that of a single star.
"""Fetch and plot the reference spectrum.
The "Plate", "MJD", and "Fiber" numbers of the reference spectrum are entered below. They are used to gather
wavelength and brightness data. That data is then plotted on a graph.
"""
# Fetch the reference spectrum.
plate = 396
mjd = 51816
fiber = 605
spec = fetch_sdss_spectrum(plate, mjd, fiber)
# Plot the reference spectrum.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title('Reference Spectrum')
ax.set_xlabel('Wavelength [Å]')
ax.set_ylabel('Relative Brightness')
ax.set_yticks([])
ln, = ax.plot(spec.wavelength(), spec.spectrum/spec.spectrum.max(), c='red', lw=0.2, label='Reference')
ax.legend()
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
# Create the interactive widget.
def f(w1):
ln.set_color(w1)
ln.set_label('Reference')
ax.legend()
layout = {'width':'initial'}
style = {'description_width':'150px'}
box_layout = {'display':'flex', 'flex_flow':'column', 'border':'5px solid grey', 'width':'initial'}
w1 = ColorPicker(description='Reference Color:', value='red', layout=layout, style=style)
interactive(f, w1=w1)
Box([w1], layout=box_layout)
Here, you will plot the spectrum of a galaxy. Look for similarities and differences in its shape and lines relative to the reference spectrum.
"""Fetch and plot the galaxy spectrum.
The "Plate", "MJD", and "Fiber" numbers of the galaxy spectrum are entered below. They are used to gather
wavelength and brightness data. That data is then plotted on a graph.
"""
# Fetch the galaxy spectrum.
plate = 2434
mjd = 53826
fiber = 359
spec1 = fetch_sdss_spectrum(plate, mjd, fiber)
# Plot the galaxy spectrum.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title('Galaxy Spectrum')
ax.set_xlabel('Wavelength [Å]')
ax.set_ylabel('Relative Brightness')
ax.set_yticks([])
ln, = ax.plot(spec1.wavelength(), spec1.spectrum/spec1.spectrum.max(), c='blue', lw=0.2, label='Galaxy')
ax.legend()
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
# Create the interactive widget.
def f(w2):
ln.set_color(w2)
ln.set_label('Galaxy')
ax.legend()
w2 = ColorPicker(description='Galaxy Color:', value='blue', layout=layout, style=style)
interactive(f, w2=w2)
Box([w2], layout=box_layout)
The next step here is to overlay a reference spectrum (called a template) onto the galaxy spectra from above.
Remember: A galaxy is a collection of billion of stars, so the shape of the spectrum is not identical to the reference spectrum of a single star. But because the stars have the same elements, notice similar "dips" in the spectra.
"""Plot the reference and galaxy spectra.
The reference and galaxy spectra are plotted side by side. They can be compared analytically so that the galaxy
spectrum's redshift value can be determined.
"""
# Plot the reference and galaxy spectra.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title('Reference vs. Galaxy Spectra')
ax.set_xlabel('Wavelength [Å]')
ax.set_ylabel('Relative Brightness')
ax.set_yticks([])
ln1, = ax.plot(spec.wavelength(), spec.spectrum/spec.spectrum.max(), c=w1.value, lw=0.2, label='Reference')
ln2, = ax.plot(spec1.wavelength(), spec1.spectrum/spec1.spectrum.max(), c=w2.value, lw=0.2, label='Galaxy')
ax.legend()
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
# Create the interactive widget.
def f(w1, w2, w3, w4, w5):
ln1.set_color(w1)
ln2.set_color(w2)
ln1.set_xdata(spec.wavelength() * (1 + w3))
ln1.set_ydata(spec.spectrum/spec.spectrum.max() + w4)
if w5 == 'On':
ax.relim()
ax.autoscale_view()
ln1.set_label('Reference')
ln2.set_label('Galaxy')
ax.legend()
w3 = FloatSlider(description='Redshift:', value=0, min=0, max=0.2, step=0.01, layout=layout, style=style)
w4 = FloatSlider(description='Vertical Separation:', value=0, min=-1, max=1, step=0.01, layout=layout, style=style)
w5 = ToggleButtons(description='Window Resizing:', options=['On', 'Off'], layout=layout, style=style)
interactive(f, w1=w1, w2=w2, w3=w3, w4=w4, w5=w5)
Box([w3, w4, w5, w1, w2], layout=box_layout)
This is what we saw as the redshift due to the expansion of the universe, which causes galaxies to appear to recede away from us.
The next step adds four new mystery galaxies to analyze. As with the one before, find the redshift of each of the four mystery galaxies. Compare your answers with the person next to you.
"""Plot the reference and mystery galaxy spectra.
Various mystery galaxy spectra are plotted next to the reference to showcase a variety of redshifts.
"""
# Fetch the mystery galaxy spectrum.
plate = 2121
mjd = 54180
fiber = 414
spec2 = fetch_sdss_spectrum(plate, mjd, fiber)
# Plot the reference and mystery galaxy spectra.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_title('Reference vs. Mystery Galaxy 1 Spectra')
ax.set_xlabel('Wavelength [Å]')
ax.set_ylabel('Relative Brightness')
ax.set_yticks([])
ln1, = ax.plot(spec.wavelength(), spec.spectrum/spec.spectrum.max(), c=w1.value, lw=0.2, label='Reference')
ln2, = ax.plot(spec2.wavelength(), spec2.spectrum/spec2.spectrum.max(), c=w2.value, lw=0.2, label='Galaxy')
ax.legend()
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
# Create the interactive widget.
def f(w1, w2, w3, w4, w5, w6):
if w6 == 'Mystery Galaxy 1':
plate = 2121
mjd = 54180
fiber = 414
ax.set_title('Reference vs. Mystery Galaxy 1 Spectra')
if w6 == 'Mystery Galaxy 2':
plate = 1759
mjd = 53081
fiber = 126
ax.set_title('Reference vs. Mystery Galaxy 2 Spectra')
if w6 == 'Mystery Galaxy 3':
plate = 1839
mjd = 53471
fiber = 310
ax.set_title('Reference vs. Mystery Galaxy 3 Spectra')
if w6 == 'Mystery Galaxy 4':
plate = 2121
mjd = 54180
fiber = 523
ax.set_title('Reference vs. Mystery Galaxy 4 Spectra')
spec2 = fetch_sdss_spectrum(plate, mjd, fiber)
ln2.set_xdata(spec2.wavelength())
ln2.set_ydata(spec2.spectrum/spec2.spectrum.max())
ln1.set_color(w1)
ln2.set_color(w2)
ln1.set_xdata(spec.wavelength() * (1 + w3))
ln1.set_ydata(spec.spectrum/spec.spectrum.max() + w4)
if w5 == 'On':
ax.relim()
ax.autoscale_view()
ln1.set_label('Reference')
ln2.set_label('Galaxy')
ax.legend()
w3 = FloatSlider(description='Redshift:', value=0, min=0, max=0.2, step=0.01, layout=layout, style=style)
w4 = FloatSlider(description='Vertical Separation:', value=0, min=-1, max=1, step=0.01, layout=layout, style=style)
w5 = ToggleButtons(description='Window Resizing:', options=['On', 'Off'], layout=layout, style=style)
galaxies = ['Mystery Galaxy 1', 'Mystery Galaxy 2', 'Mystery Galaxy 3', 'Mystery Galaxy 4']
w6 = ToggleButtons(description='Mystery Galaxy:', options=galaxies, layout=layout, style=style)
interactive(f, w1=w1, w2=w2, w3=w3, w4=w4, w5=w5, w6=w6)
Box([w6, w3, w4, w5, w1, w2], layout=box_layout)
Now, let's check the redshift and learn more information about these galaxies. Click this link. Click "Search" on the left hand side menu bar, and then enter the Plate, Fiber and MJD for any one of the mystery galaxies above (find these in the code), and hit "Go". If you click on the image, you can move around, zoom in and out - it's like Google Maps for the night sky!
Let's see how far away the four mystery galaxies are compared with each other by placing them on the "redshift ruler" on the white board at the front of the room (if applicable).
Well done! You have measured redshifts for two galaxies, which is how astronomers determine distances to galaxies. Remember, the further away a galaxy is from us, the faster is seems to be moving away from us, and the more its light (spectrum) is redshifted! That's because the universe is expanding.
Next, let's see what we can learn if we apply this information to many galaxies. Onward to exploring our vast universe!
Similarly to using coordinates of latitude and longitude, the coordinates on the sky are defined onto a sphere. They are called RA (Right Ascension) and Dec (Declination). There are two illustrations below of these coordinate systems.
Next, we will fetch the positions of galaxies on the sky, and plot their RA and Dec coordinates. Run the cells below to actually fetch the galaxy sample and plot their positions on the sky.
"""Fetch and plot a section of the sky from the SDSS data."""
# Fetch the SDSS sata.
data = fetch_sdss_specgals()
print('Done retrieving the galaxy sample.\n')
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
print('Range for RA values: %s %s' %(np.amin(RA),np.amax(RA)))
print('Range for DEC values: %s %s' %(np.amin(DEC),np.amax(DEC)))
# Convert RA range to [-180, 180] instead of [0, 360].
RA -= 180
print('Range for RA values after conversion: %s %s' %(np.amin(RA),np.amax(RA)))
# Plot the RA/DEC positions.
fig = plt.figure()
ax = fig.add_subplot(111, fc='black')
ax.set_title('SDSS Galaxies')
ax.set_xlabel('RA [degrees]')
ax.set_ylabel('DEC [degrees]')
ax.set_xlim(-8, 8) # Range for the x axis (horizontal)
ax.set_ylim(0, 16) # Range for the y axis (vertical)
ln, = ax.plot(RA, DEC, c='white', marker='.', ms=1, ls='', alpha=0.5)
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
Done retrieving the galaxy sample. Range for RA values: 0.0006706232093165454 359.99737558256095 Range for DEC values: -11.252827038865547 70.28739962843254 Range for RA values after conversion: -179.99932937679068 179.99737558256095
We saw before that in order to know the full distribution in 3D, we need to know how far away the galaxies are located. Here, we will add the information from the redshift.
Remember: The larger the redshift, the further away the galaxy!
First, we will plot all galaxies in white, and show galaxies that have approximately the same redshift in yellow. In this first example, we will select for values of redshift with a slider widget. This is done by computing galaxies in a window of $\pm 0.01$ around the value of the redshift slider. We call this an interval of redshift.
"""Highlight distant galaxies based on redshift."""
# Fetch the SDSS data.
data = fetch_sdss_specgals()
print('Done retrieving the galaxy sample.\n')
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
# Convert RA range to [-180, 180] instead of [0, 360].
RA -= 180
# Define redshift variable z.
z = data['z']
# Select a redshift range to highlight.
rz = np.where(np.absolute(z) < 0.01)
# Plot the RA/DEC positions.
fig = plt.figure()
ax = fig.add_subplot(111, fc='black')
ax.set_title('SDSS Galaxies')
ax.set_xlabel('RA [degrees]')
ax.set_ylabel('DEC [degrees]')
ax.set_xlim(-8, 8) # Range for the x axis (horizontal)
ax.set_ylim(0, 16) # Range for the y axis (vertical)
ln1, = ax.plot(RA, DEC, c='white', marker='.', ms=1, ls='', alpha=0.5)
ln2, = ax.plot(RA[rz], DEC[rz], c='yellow', marker='.', ms=3, ls='', alpha=1) # rz selected galaxies
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
# Create the interactive widget.
def f(w3):
rz = np.where(np.absolute(z-w3) < 0.01)
ln2.set_data([RA[rz], DEC[rz]])
w3 = FloatSlider(description='Redshift:', value=0, min=0, max=0.2, step=0.01, layout=layout, style=style)
interactive(f, w3=w3)
Box([w3], layout=box_layout)
Done retrieving the galaxy sample.
Now, instead of showing just one interval of redshift in yellow, we will show the redshift of each galaxy color-coded. Each galaxy is shown with a dot, and each dot will have a color corresponding to the redshift: purple/blue colors mean a low redshift like between $0$ and $0.05$, then green/yellow mean slightly higher redshift like $0.1$, and so on until the higher redshift shown here of $0.2$ in red. Remember that this means that points with exactly the same color are at the same distance from us!
"""Color-code distant galaxies based on redshift."""
# Fetch the SDSS data.
data = fetch_sdss_specgals()
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
# Convert RA range to [-180, 180] instead of [0, 360].
RA -= 180
# Define redshift variable z.
z = data['z']
# Plot the RA/DEC positions.
fig = plt.figure()
ax = fig.add_subplot(111, fc='black')
ax.set_title('SDSS Galaxies (Color-Coded by Redshift)')
ax.set_xlabel('RA [degrees]')
ax.set_ylabel('DEC [degrees]')
ax.set_xlim(-8, 8)
ax.set_ylim(0, 16)
ln = ax.scatter(RA, DEC, s=1, c=z, lw=0, cmap=cm.rainbow, vmin=0, vmax=0.2)
fig.colorbar(ln) # Color bar
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
# Count how many galaxies are within the plot.
points = np.column_stack([RA, DEC])
verts = np.array([[-8, -8, 8, 8], [0, 16, 16, 0]]).T
path = mpath.Path(verts)
points_inside = points[path.contains_points(points)]
print('-------------------------------------')
print('Number of galaxies in the plot above: %s' %np.count_nonzero(points_inside))
------------------------------------- Number of galaxies in the plot above: 48148
Next, you will make a 3D version of the above 2D plot (same dots but in 3D).
"""Plot distant galaxies based on redshift in 3D."""
# Fetch the SDSS data.
data = fetch_sdss_specgals()
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
# Convert RA range to [-180, 180] instead of [0, 360].
RA -= 180
# Define redshift variable z.
z = data['z']
# Setting range limits.
limits = np.logical_and(
np.logical_and(
np.logical_and(RA>-8, RA<8),
np.logical_and(DEC>0, DEC<16)),
np.logical_and(z>0.0, z<0.2))
RA_new = RA[limits]
DEC_new = DEC[limits]
z_new = z[limits]
# Plot the RA/DEC positions.
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.w_xaxis.set_pane_color((0, 0, 0, 1))
ax.w_yaxis.set_pane_color((0, 0, 0, 1))
ax.w_zaxis.set_pane_color((0, 0, 0, 1))
ax.set_title('SDSS Galaxies (Color-Coded by Redshift)')
ax.set_xlabel('RA [degrees]')
ax.set_ylabel('DEC [degrees]')
ax.set_zlabel('Redshift (z)')
ln = ax.scatter(RA_new, DEC_new, z_new, zdir='z', s=1, c=z_new, cmap=cm.rainbow, vmin=0, vmax=0.2)
fig.colorbar(ln)
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
The color bar to the right-hand side shows the correspondence between color and redshift. As mentioned before, points with exactly the same color are at the same distance from us. Purple points are the closest to us, then blue, aqua, green and so on. Think about which galaxies/colors are near and which galaxies/colors are far.
Now, we will repeat the plots from Part 2.2 above, but with a zoom on a smaller region ("zooming in"), and then over a larger region ("zooming out").
"""Color-code distant galaxies based on redshift (zoomed-in)."""
# Fetch the SDSS data.
data = fetch_sdss_specgals()
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
# Convert RA range to [-180, 180] instead of [0, 360].
RA -= 180
# Define redshift variable z.
z = data['z']
# Plot the RA/DEC positions.
fig = plt.figure()
ax = fig.add_subplot(111, fc='black')
ax.set_title('SDSS Galaxies (Color-Coded by Redshift)')
ax.set_xlabel('RA [degrees]')
ax.set_ylabel('DEC [degrees]')
ax.set_xlim(-2, 2)
ax.set_ylim(0, 4)
ln = ax.scatter(RA, DEC, s=5, c=z, lw=0, cmap=cm.rainbow, vmin=0, vmax=0.2)
fig.colorbar(ln) # Color bar
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
# Count how many galaxies are within the plot.
points = np.column_stack([RA, DEC])
verts = np.array([[-2, -2, 2, 2], [0, 4, 4, 0]]).T
path = mpath.Path(verts)
points_inside = points[path.contains_points(points)]
print('-------------------------------------')
print('Number of galaxies in the plot above: %s' %np.count_nonzero(points_inside))
------------------------------------- Number of galaxies in the plot above: 3102
Next, you will make a 3D version of the above 2D plot (same dots but in 3D).
"""Plot distant galaxies based on redshift in 3D (zoomed-in)."""
# Fetch the SDSS data.
data = fetch_sdss_specgals()
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
# Convert RA range to [-180, 180] instead of [0, 360].
RA -= 180
# Define redshift variable z.
z = data['z']
# Setting range limits.
limits = np.logical_and(
np.logical_and(
np.logical_and(RA>-2, RA<2),
np.logical_and(DEC>0, DEC<4)),
np.logical_and(z>0.0, z<0.2))
RA_new = RA[limits]
DEC_new = DEC[limits]
z_new = z[limits]
# Plot the RA/DEC positions.
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.w_xaxis.set_pane_color((0, 0, 0, 1))
ax.w_yaxis.set_pane_color((0, 0, 0, 1))
ax.w_zaxis.set_pane_color((0, 0, 0, 1))
ax.set_title('SDSS Galaxies (Color-Coded by Redshift)')
ax.set_xlabel('RA [degrees]')
ax.set_ylabel('DEC [degrees]')
ax.set_zlabel('Redshift (z)')
ln = ax.scatter(RA_new, DEC_new, z_new, zdir='z', s=5, c=z_new, cmap=cm.rainbow, vmin=0, vmax=0.2)
fig.colorbar(ln)
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
Now, let's step back and plot galaxies over a large region of the sky!
"""Color-code distant galaxies based on redshift (zoomed-out)."""
# Fetch the SDSS data.
data = fetch_sdss_specgals()
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
# Convert RA range to [-180, 180] instead of [0, 360].
RA -= 180
# Define redshift variable z.
z = data['z']
# Plot the RA/DEC positions.
fig = plt.figure()
ax = fig.add_subplot(111, fc='black')
ax.set_title('SDSS Galaxies (Color-Coded by Redshift)')
ax.set_xlabel('RA [degrees]')
ax.set_ylabel('DEC [degrees]')
ax.set_xlim(-16, 16)
ax.set_ylim(0, 32)
ln = ax.scatter(RA, DEC, s=1, c=z, lw=0, cmap=cm.rainbow, vmin=0, vmax=0.2)
fig.colorbar(ln) # Color bar
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
# Count how many galaxies are within the plot.
points = np.column_stack([RA, DEC])
verts = np.array([[-16, -16, 16, 16], [0, 32, 32, 0]]).T
path = mpath.Path(verts)
points_inside = points[path.contains_points(points)]
print('-------------------------------------')
print('Number of galaxies in the plot above: %s' %np.count_nonzero(points_inside))
------------------------------------- Number of galaxies in the plot above: 170274
Below, we will again plot the positions of galaxies, and include the information on redshift as the color (but with a different color scheme).
The difference with the steps above is that we will now plot the sample of galaxies over the full sky. The SDSS survey does not cover the full sky, so we will see what we call the "footprint" of the survey. This means the regions of the sky where the telescope was pointed to gather images and spectra.
"""Color-code distant galaxies based on redshift (full sky projection)."""
# Fetch the SDSS data.
data = fetch_sdss_specgals()
# Define the coordinate variables for plotting.
RA = data['ra']
DEC = data['dec']
# Convert RA range to [-180, 180] instead of [0, 360].
RA -= 180
RA *= np.pi / 180
DEC *= np.pi / 180
# Define redshift variable z.
z = data['z']
# Plot the RA/DEC positions.
fig = plt.figure()
ax = fig.add_subplot(111, projection='mollweide')
ax.set_title('SDSS DR8 Spectroscopic Galaxies')
ax.grid()
ln = ax.scatter(RA, DEC, s=1, c=z, lw=0, cmap=cm.rainbow, vmin=0, vmax=0.2)
fig.colorbar(ln)
fig.canvas.toolbar_position = 'bottom'
fig.tight_layout()
Hint: for more color maps, you can look at this reference page. For example, you can replace "rainbow" with "autumn_r".